I. INTRODUCTION

A convenient model for learning is provided by the sequential compound decision problem of mathematical statistics (see, e.g., [5]). The decision-maker observes a sequence of independent random variables (samples) $X_n$, the distribution of which varies arbitrarily along the sequence. The decision-maker is required to make a sequence of decisions $d_n$ from a given set of decisions $D$, incurring a loss $L(\theta_n, d_n)$ if his decision at the $n$-th step was $d_n$ and the sample $X_n$ was distributed according to $\nu_{\theta_n}$, where $\theta_n \in \Theta$ is a parameter of the family of distributions $\{\nu_\theta : \theta \in \Theta\}$. The observation $X_n$ thus conveys some information about the parameter $\theta_n$. However, since the decision-maker does not know the distribution of $X_n$ beforehand, he tries to learn during the sequence how to utilize this information to minimize his losses.

A natural criterion of his performance is the average loss during the sequence or the expectation of the latter (average risk). It has been argued (see, e.g., [8]) that, in a sense, the best he can expect is to reduce this quantity to the Bayes risk $\rho$ of the underlying generic decision problem evaluated for the hypothetical prior distribution on $\Theta$ equal to the empirical distribution of the sequence of parameters $\{\theta_n\}$. Therefore, the decision-maker should try to reduce the excess of the average risk over the corresponding Bayes risk (the so-called regret) to zero as the index of the sequence $n \to +\infty$. Moreover, since the parameter $\theta_n \in \Theta$ may vary arbitrarily with $n$, he should do this uniformly in all sequences $\{\theta_n\}$.

Several papers have been concerned with the question of finding a decision rule for the decision-maker to achieve this goal. If we leave aside those in which some restrictions were imposed on the possible sequences of parameters (usually assuming that $\{\theta_n\}$ is a sequence of i.i.d. random variables), we can divide these works according to the assumptions made about the information available to the decision-maker. This information may be

(1) The initial information concerning the data of the generic decision problem, viz., the family of distributions $\{\nu_\theta : \theta \in \Theta\}$, the loss function $L$, and properties of the sample space $\Xi$ and the sets $\Theta$, $D$.

(2) The learned information received gradually during the sequence of decisions. Here, we can distinguish three main cases:

    (a) after each decision $d_n$ is made, the decision-maker is told the value of the parameter $\theta_n$;

    (b) after each decision $d_n$, the decision-maker observes a random estimate of the value $\theta_n$;

    (c) the decision-maker is not told anything and has to rely only on observations of the samples $X_n$.

Thus, for example, the case (2a) has been studied by J. Hannan [3], the case (2b) by S. Jílovec and the author [2], the case (2c) by E. Samuel [7] and J. Van Ryzin [5], all assuming complete initial information (all the data known). Of those assuming only partial initial information, namely the family $\{\nu_\theta : \theta \in \Theta\}$ not known, let us mention J. Van Ryzin [6] for the case (2a) and N. Alens and T. M. Cover [1] for the case (2c), the latter, however, for a slightly modified problem (compound decision problem only).

In this paper, we will make a completely different pair of assumptions, which, in our opinion, may be more appropriate for some learning models. We will assume:

(1) The decision-maker knows only his own decision space $D$ and the sample space $\Xi$. He knows nothing about the family $\{\nu_\theta : \theta \in \Theta\}$ or the set $\Theta$ (which may be infinite), nor does he know the loss function $L$. The loss function itself may, moreover, be random with an unknown distribution depending, of course, on $\theta$ and $d$.

(2) After each decision $d_n$ is made, the decision-maker is told the value of the random loss incurred by him.


We are going to define a decision rule based on these assumptions and show that the resulting regret goes uniformly to zero. We will do this for the two-decision case $D = \{1,2\}$ and the random loss $L$ uniformly bounded; the result, however, readily extends to the case of $D$ finite and $L$ with uniformly bounded third moment (see [9] for the case of a game situation).

The decision rule suggested here is simple and more or less indicated by intuition. First, the sequence of indices $n$ is divided into blocks $J_k$ of increasing length and a net of countable partitions $\mathfrak{B}_k$ of the sample space is defined. Before each decision $d_n$, a coin is flipped with probability of a head $p_k$ if $n \in J_k$ ($p_k \to 0$ as $k \to +\infty$). The outcome of the toss determines whether the $n$-th step will be a test step (to gain information) or an active step (to minimize the loss). At test steps (when the outcome is a head), either decision is taken at random with equal probability. After the loss $L_n$ is learned, the quantity $t_n = \pm(1 + L_n)$ is computed, with plus sign if $d_n = 2$ and minus sign if $d_n = 1$ (cf. the definition of $t_n$ in Section IV). Both $t_n$ and the sample $X_n$ are remembered. At active steps within the block $J_k$, the decision $d_n = 1$ or $2$ is taken according to whether the sum of the $t_i$'s over all the past test steps in the same block $J_k$, at which the sample $X_i$ fell into the same set of the partition $\mathfrak{B}_k$ as the sample $X_n$ just observed, is positive or negative. Loosely speaking, the decision taken is the one exhibiting the smaller accumulated loss in the past test steps where "nearly the same" sample was observed.
II. THE GENERIC DECISION PROBLEM

Throughout this paper $(\Omega, \mathcal{A}, P)$ will always denote the basic probability space. The expectation of a random variable $X$ will be denoted by $E\{X\}$, the conditional expectation given a sub-$\sigma$-field $\mathcal{F}$ by $E\{X \mid \mathcal{F}\}$. The symbol $I_A(\cdot)$ will denote the indicator function of a set $A$. Other symbols or definitions will either be introduced later or used according to [4].

The basic decision problem generating the sequence is defined as follows.

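Let $(\Xi, \mathfrak{X})$, the sample space, be any measurable space with the $\sigma$-field of subsets $\mathfrak{X}$ generated by a net of countable partitions $\{\mathfrak{B}_k : k = 1,2,\ldots\}$ of the set $\Xi$; that is, the partitions $\mathfrak{B}_k = \{B_{jk} : j = 1,2,\ldots\}$, where $\emptyset \neq B_{jk} \subset \Xi$; $j = 1,2,\ldots$; $B_{j_1 k} \cap B_{j_2 k} = \emptyset$ for $j_1 \neq j_2$ and $\bigcup_{j=1}^{\infty} B_{jk} = \Xi$, satisfy the conditions:

(a) for every $A \in \mathfrak{B}_k$ there exists $B \in \mathfrak{B}_{k-1}$ such that $A \subset B$;

(b) $\mathfrak{X}$ is the minimum $\sigma$-field over the sequence $\{\mathfrak{B}_k : k = 1,2,\ldots\}$.     (2.1)

Notice that all the sets $B_{jk}$ are therefore $\mathfrak{X}$-measurable.

Let $\Theta$ be an arbitrary set, the parameter space, and let $\mathcal{P} = \{\nu_\theta : \theta \in \Theta\}$ be a family of probability measures on the space $(\Xi, \mathfrak{X})$. We will assume that $\mathcal{P}$ is dominated by a $\sigma$-finite measure $\mu$ on $(\Xi, \mathfrak{X})$, i.e., that

$$\nu_\theta \ll \mu \quad \text{for every } \theta \in \Theta, \tag{2.2}$$

so that the Radon-Nikodym derivatives

$$f_\theta(x) = \frac{d\nu_\theta}{d\mu}(x), \quad x \in \Xi,$$

exist for every $\theta \in \Theta$.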
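To make condition (2.1) concrete, the following sketch uses an example that is not taken from the paper: the dyadic partitions of $\Xi = [0,1)$, where the $k$-th partition consists of the $2^k$ intervals $[(j-1)2^{-k}, j2^{-k})$. The function names `dyadic_cell` and `parent_cell` are hypothetical. The sketch only checks the nesting property (a); property (b) holds for this example because the dyadic intervals generate the Borel $\sigma$-field.

```python
def dyadic_cell(x, k):
    """1-based index j of the dyadic interval [(j-1)/2**k, j/2**k) containing x."""
    if not 0 <= x < 1:
        raise ValueError("x must lie in [0, 1)")
    # Multiplying by a power of two is exact for floats, so int() gives the cell.
    return int(x * 2**k) + 1


def parent_cell(j, k):
    """Index of the cell of partition k-1 that contains cell j of partition k.

    The argument k is kept only for readability; the parent index depends on j alone.
    """
    return (j + 1) // 2


if __name__ == "__main__":
    x = 0.3
    for k in range(1, 5):
        j = dyadic_cell(x, k)
        print(k, j, parent_cell(j, k) == dyadic_cell(x, k - 1))
```

Every cell of the $k$-th partition lies in exactly one cell of the $(k-1)$-th partition, which is what condition (a) requires.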
Next, let $D = \{1,2\}$ be the decision space and let $L$ be a random loss function defined on $\Theta \times D$, i.e., for every $\theta \in \Theta$, $d \in D$, $L(\theta,d)$ is a real-valued random variable. We will assume that $L$ is uniformly bounded; more precisely, that there is a finite constant, further on denoted by $C$, such that

$$|L(\theta,d) + 1| \le C \quad \text{for every } \theta \in \Theta,\ d \in D. \tag{2.3}$$

Let

$$0 \le \ell(\theta,d) = E\{L(\theta,d)\}, \quad \theta \in \Theta,\ d \in D. \tag{2.4}$$

To avoid the trivial case, we will also assume that

$$\ell(\theta,1) \neq \ell(\theta,2) \quad \text{for at least one } \theta \in \Theta. \tag{2.5}$$

Let $\mathcal{T}$ be the $\sigma$-field of all subsets of $\Theta$ and let $\mathbf{T}$ be the class of all purely atomic finite signed measures on $(\Theta, \mathcal{T})$ with a finite number of atoms. Let $\mathbf{T}_0 \subset \mathbf{T}$ be the subclass of all probability measures. If $\tau \in \mathbf{T}_0$ has a single atom $\{\theta\}$, we will sometimes write simply $\theta$ instead of $\tau$. Further, we will denote

$$\ell(\tau,d) = \int_\Theta \ell(\theta,d)\, d\tau(\theta), \quad \tau \in \mathbf{T},\ d \in D,$$

and

$$\Delta(\tau) = \ell(\tau,2) - \ell(\tau,1).$$

Notice that since $\tau \in \mathbf{T}$ the above integral exists and by (2.3) is uniformly bounded by $C$. Similarly, let

$$w(x,\tau,d) = \int_\Theta \ell(\theta,d)\, f_\theta(x)\, d\tau(\theta), \quad x \in \Xi,\ \tau \in \mathbf{T},\ d \in D,$$

where, clearly, $w(\cdot,\tau,d)$ is $\mathfrak{X}$-measurable and $\mu$-integrable for every $\tau \in \mathbf{T}$ and $d \in D$. Let

$$\rho(\tau) = \int_\Xi \min_{d \in D} \{w(x,\tau,d)\}\, d\mu(x)$$

be the Bayes risk for the hypothetical prior $\tau \in \mathbf{T}$. Notice that, with addition and multiplication by a constant defined naturally on $\mathbf{T}$, both $\ell(\tau,d)$ and $\Delta(\tau)$ are linear in $\tau$ and that $\rho$ is a nonnegative concave continuous functional on $\mathbf{T}$ bounded by $C$ on $\mathbf{T}_0$.
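The definition of the Bayes risk can be illustrated numerically. The sketch below assumes a finite sample space with counting measure as the dominating measure, a two-point parameter set and a 0-1 expected loss; all of these, as well as the name `bayes_risk` and the numbers used, are illustrative assumptions and not data from the paper.

```python
def bayes_risk(prior, density, loss, decisions=(1, 2)):
    """Bayes risk: sum over x of min over d of sum over theta of
    prior[theta] * loss[theta][d] * density[theta][x].

    prior   : dict theta -> weight of the atom theta
    density : dict theta -> dict x -> f_theta(x) (w.r.t. counting measure)
    loss    : dict theta -> dict d -> expected loss l(theta, d)
    """
    samples = list(next(iter(density.values())))
    total = 0.0
    for x in samples:
        # w(x, tau, d) for each decision, then the pointwise minimum over d.
        total += min(
            sum(prior[th] * loss[th][d] * density[th][x] for th in prior)
            for d in decisions
        )
    return total


if __name__ == "__main__":
    density = {"a": {0: 0.8, 1: 0.2}, "b": {0: 0.3, 1: 0.7}}
    loss = {"a": {1: 0.0, 2: 1.0}, "b": {1: 1.0, 2: 0.0}}
    mixed = bayes_risk({"a": 0.5, "b": 0.5}, density, loss)
    pure_a = bayes_risk({"a": 1.0, "b": 0.0}, density, loss)
    pure_b = bayes_risk({"a": 0.0, "b": 1.0}, density, loss)
    # Concavity: the risk of the mixture is at least the mixture of the risks.
    print(mixed, 0.5 * pure_a + 0.5 * pure_b)
```

In this example the two point-mass priors have zero Bayes risk while their equal mixture does not, which is an instance of the concavity noted above.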
III. THE SEQUENTIAL COMPOUND DECISION PROBLEM

The sequential compound decision problem is obtained by infinite repetition of the generic decision problem with $\theta \in \Theta$ varying arbitrarily along the sequence.

Let $\Theta^\infty$ be the set of all sequences

$$\boldsymbol{\theta} = \{\theta_n : n = 0,1,\ldots\}.$$

For convenience, we will assume that $\theta_0$ is such that $L(\theta_0, \cdot) = 0$. There is no loss of generality in this assumption since the sequence in fact begins with $\theta_1$, and $\theta_0$ is merely a dummy parameter.

Given a sequence $\boldsymbol{\theta}$ and an interval of integers $J = (n_1, n_2]$, $0 \le n_1 < n_2$, we define $\tau_J \in \mathbf{T}_0$ by

$$\tau_J(A) = (n_2 - n_1)^{-1} \sum_{n=n_1+1}^{n_2} I_A(\theta_n), \quad A \in \mathcal{T}. \tag{3.1}$$

If $J = (0,n]$, we will write simply $\tau_n$ instead of $\tau_{(0,n]}$.
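The empirical distribution (3.1) is simply the table of relative frequencies of the parameters on the interval $J$. A minimal sketch, assuming the parameter sequence is stored as a Python list whose entry 0 is the dummy parameter (the name `empirical_prior` is hypothetical):

```python
from collections import Counter


def empirical_prior(thetas, n1, n2):
    """Relative frequencies of thetas[n1+1], ..., thetas[n2], i.e. the
    empirical distribution of the parameters on the interval (n1, n2]."""
    if not 0 <= n1 < n2 < len(thetas):
        raise ValueError("need 0 <= n1 < n2 < len(thetas)")
    counts = Counter(thetas[n1 + 1 : n2 + 1])
    length = n2 - n1
    return {theta: c / length for theta, c in counts.items()}


if __name__ == "__main__":
    thetas = ["dummy", "a", "b", "a", "a"]  # theta_0 is the dummy parameter
    print(empirical_prior(thetas, 0, 4))
    print(empirical_prior(thetas, 2, 4))
```

The result has finitely many atoms and total mass one, so it is a probability measure with finitely many atoms, as the definition requires.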

Next, let $\{X_n : n = 0,1,\ldots\}$ be a sequence of independent random variables taking values in $(\Xi, \mathfrak{X})$ with

$$P X_n^{-1} = \nu_{\theta_n}, \quad \theta_n \in \Theta,$$

and let $\{L_n(\theta_n, d_n) : \theta_n \in \Theta,\ d_n \in D\}$ be a sequence of independent random variables distributed as $L(\theta,d)$ whenever $\theta_n = \theta$, $d_n = d$. The sequences $\{X_n : n = 0,1,\ldots\}$ and $\{L_n(\theta_n,d_n) : n = 0,1,\ldots\}$ are assumed to be mutually independent. Here, $X_n$ is the observed sample and $L_n(\theta_n,d_n)$ is the random loss for the decision $d_n$ at the $n$-th step.
Let $s^*$ be a mapping from $(-\infty, +\infty)$ into $D$ defined by:

$$s^*(y) = \begin{cases} 1 & \text{if } y > 0, \\ \text{arbitrary} & \text{if } y = 0, \\ 2 & \text{if } y < 0. \end{cases}$$

We will need the following

Lemma. Let $\{U'_n : n = 0,1,\ldots\}$ and $\{V_n : n = 0,1,\ldots\}$ be mutually independent sequences of Bernoulli random variables with

$$P\{U'_n = 1\} = p, \quad 0 < p \le 1,$$

and

$$P\{V_n = 1\} = P\{V_n = 0\} = \tfrac{1}{2};$$

let

$$S_n = s^*\Big(\sum_{i=0}^{n} Y'_i\Big), \quad n = 1,2,\ldots,$$

where

$$Y'_n = 2p^{-1} U'_n \big[V_n (1 + L_n(\theta_n,2)) - (1 - V_n)(1 + L_n(\theta_n,1))\big].$$

Then there is a finite constant $K$ depending only on the constant $C$ in (2.3) such that for every $n = 1,2,\ldots$, $\boldsymbol{\theta} \in \Theta^\infty$,

$$E\Big\{n^{-1} \sum_{i=1}^{n} \ell(\theta_i, S_i)\Big\} - \min_{d \in D} \{\ell(\tau_n, d)\} \le K (pn)^{-1/2}.$$

Proof.

Let $Y_n = Y'_n - E\{Y'_n\}$. Notice that $E\{Y'_n\} = \Delta(\theta_n)$ and that the random variables $Y_n$ are uniformly bounded,

$$|Y_n| \le c\, p^{-1}, \tag{3.2}$$

where $c < +\infty$ is a constant independent of $p$ and $\theta_n$. Further, elementary computation yields

$$E|Y_n|^r \le (4C)^r p^{1-r}, \quad r = 1,2,3, \tag{3.3}$$

and for $0 < p \le \tfrac{1}{2}$

$$E\{Y_n^2\} \ge 4p^{-1}. \tag{3.4}$$


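Let

$$Z_n = n^{-1/2} \sum_{i=0}^{n} Y_i; \quad n = 1,2,\ldots;$$

let $F_n$ denote the distribution function of the law $\mathcal{L}(Z_n)$. Since the random variables $Y_i$ are independent, are centered at expectations, and have positive variance, the Berry-Esseen normal approximation (see [4], p. 288) yields

$$\sup_{x \in (-\infty,+\infty)} |F^*_n(x) - G^*(x)| \le \beta \Big(\sum_{i=0}^{n} E\{Y_i^2\}\Big)^{-3/2} \sum_{i=0}^{n} E|Y_i|^3, \tag{3.5}$$

where $\beta$ is the Berry-Esseen constant, $G^*$ is the distribution function of the normal law $\mathcal{N}(0,1)$, and $F^*_n$ is the distribution function of the law

$$\mathcal{L}\Big(\Big(\sum_{i=0}^{n} E\{Y_i^2\}\Big)^{-1/2} \sum_{i=0}^{n} Y_i\Big).$$

Denote

$$a_n = \Big(n^{-1} \sum_{i=0}^{n} E\{Y_i^2\}\Big)^{1/2}; \quad n = 1,2,\ldots.$$

Since

$$\Big(\sum_{i=0}^{n} E\{Y_i^2\}\Big)^{-1/2} \sum_{i=0}^{n} Y_i = a_n^{-1} Z_n; \quad n = 1,2,\ldots;$$

we have

$$F^*_n(x) = F_n(a_n x).$$

Further by (3.3) and (3.4),

$$\Big(\sum_{i=0}^{n} E\{Y_i^2\}\Big)^{-3/2} \sum_{i=0}^{n} E|Y_i|^3 \le 8C^3 (n+1)^{-1/2} p^{-1/2},$$

so that (3.5) becomes

$$\sup_{x \in (-\infty,+\infty)} |F_n(x) - G^*(a_n^{-1} x)| \le 8C^3 \beta (n+1)^{-1/2} p^{-1/2}; \quad n = 1,2,\ldots.$$

Hence for any real numbers $x_1$, $x_2$; $x_1 \le x_2$;

$$P\{x_1 \le Z_n \le x_2\} \le (x_2 - x_1)(8\pi)^{-1/2} p^{1/2} + 16 C^3 \beta (n+1)^{-1/2} p^{-1/2}; \quad n = 1,2,\ldots, \tag{3.6}$$

since

$$G^*(a_n^{-1} x_2) - G^*(a_n^{-1} x_1) \le (x_2 - x_1)\, a_n^{-1} (2\pi)^{-1/2} \le (x_2 - x_1)(8\pi)^{-1/2} p^{1/2}$$

by (3.4).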
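So equipped, we can start proving the lemma. Let us denote the left-hand side of the inequality to be proved by

$$Q_n = E\Big\{n^{-1} \sum_{i=1}^{n} \ell(\theta_i, S_i)\Big\} - \varphi(\tau_n),$$

where

$$\varphi(\tau_n) = \min_{d \in D} \{\ell(\tau_n, d)\}.$$

Since $\ell(\theta_i, d) = i\,\ell(\tau_i, d) - (i-1)\,\ell(\tau_{i-1}, d)$, we have

$$Q_n = E\Big\{n^{-1}\Big[\sum_{i=1}^{n} i\,\ell(\tau_i,S_i) - \sum_{i=1}^{n-1} i\,\ell(\tau_i,S_{i+1}) - n\varphi(\tau_n)\Big]\Big\}$$

$$\phantom{Q_n} = E\Big\{n^{-1}\Big[\sum_{i=1}^{n-1} i\,[\ell(\tau_i,S_i) - \ell(\tau_i,S_{i+1})] + n\,[\ell(\tau_n,S_n) - \varphi(\tau_n)]\Big]\Big\}, \tag{3.7}$$

where $\tau_i \in \mathbf{T}_0$ is defined by (3.1).

Before going further, notice that for $\tau \in \mathbf{T}$, $x$ a real number, $d_1 = s^*(\Delta(\tau) + x)$ and $d_2 \in D$ we have the inequality

$$\ell(\tau,d_1) - \ell(\tau,d_2) \le x\,(d_2 - d_1). \tag{3.8}$$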
To see this, let $t_x \in \mathbf{T}$ be such that $\Delta(t_x) = x$. For example, $t_x$ may have a single atom $\{\theta\}$ with $t_x(\{\theta\}) = x\,[\Delta(\theta)]^{-1}$ for some $\theta \in \Theta$ such that $\Delta(\theta) \neq 0$ (see assumption (2.5)). Now, since $\Delta(\tau + t_x) = \Delta(\tau) + x$, we have $\varphi(\tau + t_x) = \ell(\tau + t_x, d_1) \le \ell(\tau + t_x, d_2)$ so that by linearity of $\ell(\cdot,d)$

$$\ell(\tau,d_1) - \ell(\tau,d_2) \le \ell(t_x,d_2) - \ell(t_x,d_1) = \begin{cases} \Delta(t_x) & \text{if } d_1 = 1,\ d_2 = 2; \\ -\Delta(t_x) & \text{if } d_1 = 2,\ d_2 = 1; \\ 0 & \text{if } d_1 = d_2; \end{cases}$$

which together with $\Delta(t_x) = x$ proves (3.8).

We apply this inequality to the summands in (3.7). Since by definition

$$S_n = s^*\Big(\sum_{i=0}^{n} Y'_i\Big)$$

and

$$n^{-1} \sum_{i=0}^{n} Y'_i = n^{-1} \sum_{i=0}^{n} E\{Y'_i\} + n^{-1} \sum_{i=0}^{n} Y_i = \Delta(\tau_n) + n^{-1/2} Z_n; \quad n = 1,2,\ldots;$$

we can set $x = i^{-1/2} Z_i$, $d_1 = S_i$ and $d_2 = S_{i+1}$ in (3.8), thus obtaining

$$\ell(\tau_i,S_i) - \ell(\tau_i,S_{i+1}) \le i^{-1/2} Z_i\,(S_{i+1} - S_i). \tag{3.9}$$

Similarly, setting $d_2 = s^*[\Delta(\tau_n)]$ and using the fact that by the definition of the mapping $s^*$, $\ell(\tau,d_2) = \varphi(\tau)$, we obtain
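$$\ell(\tau_n,S_n) - \varphi(\tau_n) \le n^{-1/2} Z_n\,[s^*(\Delta(\tau_n)) - S_n]. \tag{3.10}$$

Substituting (3.9) and (3.10) into (3.7), we have

$$Q_n \le E\Big\{n^{-1}\Big[\sum_{i=1}^{n-1} i^{1/2} Z_i\,(S_{i+1} - S_i) + n^{1/2} Z_n\,[s^*(\Delta(\tau_n)) - S_n]\Big]\Big\}$$

$$\phantom{Q_n} = n^{-1} \sum_{i=1}^{n} E\big\{[(i-1)^{1/2} Z_{i-1} - i^{1/2} Z_i]\, S_i\big\} + n^{-1/2} E\{Z_n\}\, s^*(\Delta(\tau_n)),$$

where $Z_0 = 0$. However, by definition, $i^{1/2} Z_i - (i-1)^{1/2} Z_{i-1} = Y_i$, and since the $Y_i$'s are centered at expectations, $E\{Z_n\} = 0$. Therefore,

$$Q_n \le -n^{-1} \sum_{i=1}^{n} E\{Y_i S_i\}. \tag{3.11}$$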

Next, by definition of $S_i$ and the fact that $E\{Y_i\} = 0$,

$$-E\{Y_i S_i\} = \int_{\{\sum_{j=0}^{i} Y'_j \ge 0\}} Y_i\, dP \tag{3.12a}$$

or

$$-E\{Y_i S_i\} = \int_{\{\sum_{j=0}^{i} Y'_j > 0\}} Y_i\, dP \tag{3.12b}$$

depending on whether $S_i = 1$ or $2$ when the argument of the mapping $s^*$ is zero. Let us consider the former case first. We have

$$\Big\{\sum_{j=0}^{i} Y'_j \ge 0\Big\} = \Big\{Y_i \ge -\sum_{j=0}^{i-1} Y_j - \sum_{j=0}^{i} \Delta(\theta_j)\Big\}$$

and by (3.2)

$$H'_i \subset \Big\{\sum_{j=0}^{i} Y'_j \ge 0\Big\} \subset H_i \cup H'_i,$$

where

$$H_i = \Big\{\Big|\sum_{j=0}^{i-1} Y_j + \sum_{j=0}^{i} \Delta(\theta_j)\Big| \le c\,p^{-1}\Big\}, \qquad H'_i = \Big\{\sum_{j=0}^{i-1} Y_j > c\,p^{-1} - \sum_{j=0}^{i} \Delta(\theta_j)\Big\}.$$

Hence

$$\int_{\{\sum_{j=0}^{i} Y'_j \ge 0\}} Y_i\, dP \le \int_{H_i} Y_i^{+}\, dP + \int_{H'_i} Y_i\, dP.$$

However, since the random variables $Y_i$ are independent,

$$\int_{H_i} Y_i^{+}\, dP = P(H_i) \int Y_i^{+}\, dP \le P(H_i)\, E|Y_i| \le 4C\, P(H_i)$$

by (3.3) and

$$\int_{H'_i} Y_i\, dP = P(H'_i)\, E\{Y_i\} = 0.$$

Moreover,

$$H_i = \Big\{-c(i-1)^{-1/2} p^{-1} - (i-1)^{-1/2} \sum_{j=0}^{i} \Delta(\theta_j) \le Z_{i-1} \le c(i-1)^{-1/2} p^{-1} - (i-1)^{-1/2} \sum_{j=0}^{i} \Delta(\theta_j)\Big\}$$

so that by (3.6)

$$P(H_i) \le 2c\,(ip)^{-1/2} + 16 C^3 \beta\,(ip)^{-1/2}$$

for $i = 2,3,\ldots$ and trivially also for $i = 1$. Thus (3.12a) becomes

$$-E\{Y_i S_i\} \le (8Cc + 64 C^4 \beta)(ip)^{-1/2}. \tag{3.13}$$

For the case (3.12b), it is easy to see that exactly the same reasoning applies so that (3.13) holds again. Substituting (3.13) into (3.11), we have finally

$$Q_n \le (8Cc + 64 C^4 \beta)\, p^{-1/2}\, n^{-1} \sum_{i=1}^{n} i^{-1/2},$$

which together with $n^{-1} \sum_{i=1}^{n} i^{-1/2} \le 2 n^{-1/2}$ proves the lemma.
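The identity $E\{Y'_n\} = \Delta(\theta_n)$ used at the start of the proof can be checked by direct enumeration. The sketch below assumes, purely for illustration, that the losses are non-random numbers `loss1` and `loss2` standing for $\ell(\theta_n,1)$ and $\ell(\theta_n,2)$; the function name `expected_Y` is hypothetical.

```python
def expected_Y(p, loss1, loss2):
    """Exact expectation of Y' = (2/p) * U * [V(1 + loss2) - (1 - V)(1 + loss1)]
    for U ~ Bernoulli(p) and V ~ Bernoulli(1/2), U and V independent."""
    total = 0.0
    for u, prob_u in ((1, p), (0, 1.0 - p)):
        for v in (0, 1):
            y = (2.0 / p) * u * (v * (1.0 + loss2) - (1 - v) * (1.0 + loss1))
            total += prob_u * 0.5 * y
    return total


if __name__ == "__main__":
    # The offset 1 cancels and the factor 2/p undoes the sampling probabilities,
    # so the result equals loss2 - loss1 up to rounding.
    print(expected_Y(0.25, 0.2, 0.7), 0.7 - 0.2)
```

The factor $2p^{-1}$ compensates for the probability $p/2$ with which each decision is tested, which is why the estimate is unbiased for every $p$.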
IV. THE DECISION RULE

Let $\{M_k : k = 1,2,\ldots\}$ be a sequence of positive integers; let $\{N_k : k = 0,1,\ldots\}$ be a sequence of integers defined by

$$N_k = N_{k-1} + M_k, \quad k = 1,2,\ldots; \qquad N_0 = 0;$$

let $J_k = (N_{k-1}, N_k]$ be intervals of integers.
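As an illustration of the block structure, the sketch below builds the intervals $(N_{k-1}, N_k]$ for one particular choice of block lengths and test probabilities, $M_k = k^3$ and $p_k = 1/(k+1)$. This choice is an assumption made here for the example only; it is one of many for which $p_k \to 0$ while $M_k p_k \to +\infty$, the conditions required by the Theorem of this section. The name `block_schedule` is hypothetical.

```python
def block_schedule(num_blocks):
    """Return a list of (N_{k-1}, N_k, p_k) for k = 1, ..., num_blocks,
    using the illustrative choice M_k = k**3 and p_k = 1/(k + 1)."""
    blocks = []
    n_prev = 0
    for k in range(1, num_blocks + 1):
        m_k = k**3                      # block length M_k
        p_k = 1.0 / (k + 1)             # test probability in block k
        blocks.append((n_prev, n_prev + m_k, p_k))
        n_prev += m_k                   # N_k = N_{k-1} + M_k
    return blocks


if __name__ == "__main__":
    for start, end, p in block_schedule(5):
        # (end - start) * p is the expected number of test steps in the block.
        print(start, end, p, (end - start) * p)
```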

Let $\{U_n : n = 0,1,\ldots\}$ be a sequence of independent random variables taking values 0 and 1 with $P\{U_n = 1\} = p_k$ whenever $n+1 \in J_k$, and let $\{V_n : n = 0,1,\ldots\}$ be a sequence of i.i.d. random variables taking values 0 and 1 with probability $\tfrac{1}{2}$.

The sequences $\{U_n\}$ and $\{V_n\}$ are supposed to be mutually independent and also independent of the sequences $\{X_n\}$ and $\{L_n\}$ defined in Section III. The random variable $U_n$ determines whether the $n$-th step will be a test step ($U_n = 1$) or an active step ($U_n = 0$), and $V_n$ determines which decision will be used if the $n$-th step is a test one. Let $\boldsymbol{\theta} = \{\theta_n : n = 0,1,\ldots\} \in \Theta^\infty$ be a sequence of parameters and let

$$t_n = V_n\,[1 + L_n(\theta_n,2)] - (1 - V_n)\,[1 + L_n(\theta_n,1)].$$

The sequences […], together with the sequence of samples $\{X_n\}$, represent the entire information the decision-maker is receiving.

The decision rule is now defined as follows:

At the $n$-th step, $n \in J_k$; $k = 1,2,\ldots$;

    1/ if $U_n = 1$, then decide $d^*_n = V_n + 1$;

    2/ if $U_n = 0$ and $X_n \in B_{jk}$, then decide $d^*_n = s^*\big(\sum_i U_i t_i I_{B_{jk}}(X_i)\big)$,     (*)

where the sum extends over the past steps $i$ of the block $J_k$.
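The rule can be sketched in a few lines of code. The following is a minimal simulation under assumptions of its own, not the paper's formal construction: steps are numbered from 1, ties in $s^*$ are broken in favour of decision 1, and the environment is supplied through the hypothetical callbacks `loss(n, d)` and `cell(x, k)`. The names `run_rule` and `s_star` are likewise illustrative.

```python
import random


def s_star(y, tie=1):
    """Decision map: 1 if y > 0, 2 if y < 0, and `tie` if y == 0."""
    if y > 0:
        return 1
    if y < 0:
        return 2
    return tie


def run_rule(samples, loss, blocks, cell, rng):
    """Play the test/active rule once; return a list of (decision, loss) pairs.

    samples : observations X_1, ..., X_N (samples[n - 1] is X_n)
    loss    : loss(n, d) -> realised loss of decision d at step n
    blocks  : list of (start, end, p); steps start < n <= end use test probability p
    cell    : cell(x, k) -> index of the set of the k-th partition containing x
    rng     : object with random() and choice(), e.g. random.Random(seed)
    """
    history = []
    for k, (start, end, p) in enumerate(blocks, start=1):
        sums = {}  # cell index -> sum of t_i over past test steps of this block
        for n in range(start + 1, min(end, len(samples)) + 1):
            j = cell(samples[n - 1], k)
            if rng.random() < p:                  # test step
                d = rng.choice((1, 2))            # either decision, equal odds
                incurred = loss(n, d)
                t = (1 + incurred) if d == 2 else -(1 + incurred)
                sums[j] = sums.get(j, 0.0) + t    # remember t together with the cell
            else:                                 # active step
                d = s_star(sums.get(j, 0.0))
                incurred = loss(n, d)             # observed, but not used by the rule
            history.append((d, incurred))
    return history


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [rng.randint(0, 1) for _ in range(2000)]
    # Toy environment: decision x + 1 is free, the other one costs 1.
    toy_loss = lambda n, d: 0.0 if d == xs[n - 1] + 1 else 1.0
    hist = run_rule(xs, toy_loss, [(0, 2000, 0.1)], lambda x, k: x, rng)
    print(sum(l for _, l in hist) / len(hist))
```

Because the sums are reset at the start of every block, the sketch also makes visible the point raised in Section V: information from earlier blocks and from active steps is discarded.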
Theorem. Let $\{d^*_n : n = 1,2,\ldots\}$ be the sequence of decisions resulting from the decision rule (*), where $p_k \to 0$ and $M_k p_k \to +\infty$ as $k \to +\infty$; let the assumptions (2.1) - (2.5) be satisfied. Then

$$\limsup_{N \to +\infty} \Big[E\Big\{N^{-1} \sum_{n=1}^{N} L_n(\theta_n, d^*_n)\Big\} - \rho(\tau_N)\Big] = 0$$

uniformly in all sequences $\boldsymbol{\theta} \in \Theta^\infty$.

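Proof.

Let $\boldsymbol{\theta}$ be a sequence from $\Theta^\infty$. Since, by (*), $d^*_n$ and the random loss $L_n$ are independent, $E\{L_n(\theta_n,d^*_n)\} = E\{\ell(\theta_n,d^*_n)\}$. Let

$$Y_n = 2p_k^{-1} U_n t_n, \quad n+1 \in J_k, \quad k = 1,2,\ldots, \tag{4.1}$$

and let

$$S_n = s^*\Big(\sum_{i=N_{k-1}}^{n} Y_i\, I_{B_{jk}}(X_i)\Big) \tag{4.2}$$

whenever $X_n \in B_{jk}$, $n \in J_k$, $k = 1,2,\ldots$. We have

$$E\Big\{N^{-1} \sum_{n=1}^{N} L_n(\theta_n,d^*_n)\Big\} - \rho(\tau_N) = E\Big\{N^{-1} \sum_{n=1}^{N} [\ell(\theta_n,d^*_n) - \ell(\theta_n,S_n)]\Big\} + E\Big\{N^{-1} \sum_{n=1}^{N} \ell(\theta_n,S_n)\Big\} - \rho(\tau_N). \tag{4.3}$$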
Now, by (*), (4.1), and (4.2), $U_n = 0$, $X_n \in B_{jk}$ $\Rightarrow$ $d^*_n = S_n$, so that $d^*_n$ can differ from $S_n$ only if $U_n = 1$. Therefore

$$|\ell(\theta_n,d^*_n) - \ell(\theta_n,S_n)| \le 2C\,U_n,$$

whence

$$\Big|E\Big\{N^{-1} \sum_{n=1}^{N} [\ell(\theta_n,d^*_n) - \ell(\theta_n,S_n)]\Big\}\Big| \le 2C\, E\Big\{N^{-1} \sum_{n=1}^{N} U_n\Big\}. \tag{4.4}$$

However, $E\{U_n\} = p_k$ for $(n+1) \in J_k$, so that the bound in (4.4) and therefore also the first term on the right-hand side of (4.3) go to zero as $N \to +\infty$. For the second term, let us denote

$$R_{(n_1,n_2]} = E\Big\{(n_2 - n_1)^{-1} \sum_{n=n_1+1}^{n_2} \ell(\theta_n,S_n)\Big\} - \rho(\tau_{(n_1,n_2]}),$$

where $0 \le n_1 < n_2$ are integers, and let $N \in J_K$ for some $K = 1,2,\ldots$. Since $\rho$ is concave on $\mathbf{T}_0$, we have the inequality

$$\rho(\tau_N) \ge N^{-1} \sum_{k=1}^{K-1} M_k\, \rho(\tau_{J_k}) + N^{-1}(N - N_{K-1})\, \rho(\tau_{(N_{K-1},N]}),$$

which implies

$$R_N \le N^{-1}\Big[\sum_{k=1}^{K-1} M_k R_k + (N - N_{K-1})\, R_{(N_{K-1},N]}\Big],$$

where we wrote for short $R_k$ instead of $R_{J_k}$ and $R_N$ instead of $R_{(0,N]}$.

We are going to prove that

$$\limsup_{k \to +\infty} R_k = 0 \quad \text{uniformly in } \boldsymbol{\theta} \in \Theta^\infty, \tag{4.5}$$

which in turn implies

$$\limsup_{N \to +\infty} R_N = 0 \quad \text{uniformly in } \boldsymbol{\theta} \in \Theta^\infty. \tag{4.6}$$


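To see this, let $\varepsilon > 0$. Since $M_k \to +\infty$ as $k \to +\infty$, there exists some $k(\varepsilon)$ such that

$$M_k,\ (N - N_{K-1}) \ge M_{k(\varepsilon)} \ \Rightarrow\ R_k,\ R_{(N_{K-1},N]} < \varepsilon$$

whatever be the sequence $\boldsymbol{\theta} \in \Theta^\infty$. Hence, since

$$R_{(n_1,n_2]} \le 2C,$$

we have for $K$ large enough

$$R_N \le N^{-1}\Big[\sum_{k=1}^{k(\varepsilon)} M_k R_k + \sum_{k=k(\varepsilon)+1}^{K-1} M_k R_k + (N - N_{K-1})\, R_{(N_{K-1},N]}\Big]$$

$$\phantom{R_N} \le N^{-1}\Big[2C \sum_{k=1}^{k(\varepsilon)} M_k + \varepsilon \sum_{k=k(\varepsilon)+1}^{K-1} M_k + (N - N_{K-1})\, q\Big], \tag{4.7}$$

where $q \le 2C$ if $N - N_{K-1} < M_{k(\varepsilon)}$ and $q < \varepsilon$ if $N - N_{K-1} \ge M_{k(\varepsilon)}$. However

$$N = \sum_{k=1}^{K-1} M_k + (N - N_{K-1}),$$

so that the first term on the right-hand side of (4.7) as well as the last one in the case $N - N_{K-1} < M_{k(\varepsilon)}$ can be made arbitrarily small while the second term and the last one in the case $N - N_{K-1} \ge M_{k(\varepsilon)}$ are smaller than $\varepsilon$. This proves (4.6).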
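Let us now start proving (4.5). To simplify notation, let us relabel the subscripts at $\theta_n$ and $S_n$, writing

$$R_k = E\Big\{M_k^{-1} \sum_{n=1}^{M_k} \ell(\theta_n,S_n)\Big\} - \rho(\tau_k).$$

Let

$$\rho_k(\tau) = \sum_{j=1}^{\infty} \min_{d \in D} \int_{B_{jk}} \int_\Theta \ell(\theta,d)\, f_\theta(x)\, d\tau(\theta)\, d\mu(x), \quad \tau \in \mathbf{T}_0,\ k = 1,2,\ldots, \tag{4.8}$$

and let

$$w_k(x,\tau,d) = \int_\Theta \ell(\theta,d)\, f_\theta^{(k)}(x)\, d\tau(\theta),$$

where

$$f_\theta^{(k)}(x) = \sum_j [\mu(B_{jk})]^{-1}\, \nu_\theta(B_{jk})\, I_{B_{jk}}(x), \quad x \in \Xi,$$

the summation being over those $j$'s for which $\mu(B_{jk}) > 0$. Clearly, $w_k(\cdot,\tau,d)$ is $\mathfrak{X}$-measurable and $\mu$-integrable for every $\tau \in \mathbf{T}_0$, $d \in D$ and

$$\rho_k(\tau) = \int_\Xi \min_{d \in D} \{w_k(x,\tau,d)\}\, d\mu(x).$$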
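The functional in (4.8) differs from the Bayes risk only in that the minimum over decisions is taken once per partition set instead of once per sample point. The sketch below illustrates this on a finite sample space with counting measure; the data, the two-point parameter set and the name `coarse_bayes_risk` are assumptions made for the example and do not come from the paper.

```python
def coarse_bayes_risk(prior, density, loss, partition, decisions=(1, 2)):
    """Sum over the cells of `partition` of the minimum over d of
    sum over theta of prior[theta] * loss[theta][d] * nu_theta(cell)."""
    total = 0.0
    for cell in partition:
        total += min(
            sum(
                prior[th] * loss[th][d] * sum(density[th][x] for x in cell)
                for th in prior
            )
            for d in decisions
        )
    return total


if __name__ == "__main__":
    density = {"a": {0: 0.8, 1: 0.2}, "b": {0: 0.3, 1: 0.7}}
    loss = {"a": {1: 0.0, 2: 1.0}, "b": {1: 1.0, 2: 0.0}}
    prior = {"a": 0.5, "b": 0.5}
    # Trivial partition: the sample is ignored, so the risk is larger.
    print(coarse_bayes_risk(prior, density, loss, [[0, 1]]))
    # Finest partition: the value coincides with the Bayes risk itself.
    print(coarse_bayes_risk(prior, density, loss, [[0], [1]]))
```

Refining the partition can only lower the value, since a finer partition allows the decision to depend on more of the sample.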


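Let $\mathcal{F}_k$ be the minimum $\sigma$-field over the partition $\mathfrak{B}_k$. Since by the assumption (2.1) $\mathcal{F}_1 \subset \mathcal{F}_2 \subset \mathcal{F}_3 \subset \ldots \subset \mathfrak{X}$, $\{f_\theta^{(k)} : k = 1,2,\ldots\}$ is for every $\theta \in \Theta$ a martingale sequence on the measure space $(\Xi, \mathfrak{X}, \mu)$ closed on the right by $f_\theta$. Moreover, $f_\theta$ is the nearest $\mathfrak{X}$-measurable $\mu$-integrable function closing $\{f_\theta^{(k)} : k = 1,2,\ldots\}$ on the right. This can be easily seen since if $g$ is another such function closing the sequence on the right, i.e., if

$$B \in \mathcal{F}_k \ \Rightarrow\ \int_B g\, d\mu = \int_B f_\theta^{(k)}\, d\mu$$

for all $k = 1,2,\ldots$, then also

$$B \in \mathfrak{X} \ \Rightarrow\ \int_B g\, d\mu = \int_B f_\theta\, d\mu$$

since $\mathcal{F}_k \uparrow \mathfrak{X}$ as $k \to +\infty$ by (2.1). Hence the martingale closure theorem (see [4], p. 394) applies, yielding

$$f_\theta^{(k)} \to f_\theta \quad \text{as } k \to +\infty,\ \mu\text{-almost everywhere},$$

for every $\theta \in \Theta$. Since $\tau \in \mathbf{T}_0$ has a finite number of atoms, this implies that also

$$\lim_{k \to +\infty} \min_{d \in D} \{w_k(x,\tau,d)\} = \min_{d \in D} \{w(x,\tau,d)\},$$

and since

$$\int_\Xi |w_k(x,\tau,d)|\, d\mu(x); \quad k = 1,2,\ldots;$$

are uniformly bounded, it follows that for every $\tau \in \mathbf{T}_0$

$$\lim_{k \to +\infty} \rho_k(\tau) = \rho(\tau).$$

However, $\rho, \rho_1, \rho_2, \ldots$ are continuous functionals on the compact $\mathbf{T}_0$ so that the above convergence is uniform in $\tau \in \mathbf{T}_0$. Therefore

$$\lim_{k \to +\infty} |\rho_k(\tau_k) - \rho(\tau_k)| = 0$$

uniformly in all sequences $\boldsymbol{\theta} \in \Theta^\infty$ and, in view of (4.5), it remains to prove that

$$\limsup_{k \to +\infty} R'_k = 0 \tag{4.9}$$

uniformly in $\boldsymbol{\theta} \in \Theta^\infty$, where

$$R'_k = E\Big\{M_k^{-1} \sum_{n=1}^{M_k} \ell(\theta_n,S_n)\Big\} - \rho_k(\tau_k).$$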


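Let $A_{jk} = \{n \in \{1,\ldots,M_k\} : X_n \in B_{jk}\}$, let $|A_{jk}|$ denote the number of elements of $A_{jk}$, and let $\mathfrak{M}_{jk}$; $j = 1,2,\ldots$; $k = 1,2,\ldots$; be the sub-$\sigma$-field of $\mathcal{A}$ induced by the family $\{I_{B_{jk}}(X_n) : n = 1,\ldots,M_k\}$. We have

$$E\Big\{M_k^{-1} \sum_{n=1}^{M_k} \ell(\theta_n,S_n)\Big\} = E\Big\{\sum_{j=1}^{\infty} M_k^{-1} \sum_{n \in A_{jk}} \ell(\theta_n,S_n)\Big\} = \sum_{j=1}^{\infty} E\Big\{M_k^{-1} \sum_{n \in A_{jk}} \ell(\theta_n,S_n)\Big\},$$

where the series in the last term absolutely converges since the summands are bounded by $C\,\nu_{\tau_k}(B_{jk})$, which sum up to the constant $C$. Next by (4.8),

$$\rho_k(\tau_k) = \sum_{j=1}^{\infty} \min_{d \in D} \Big\{M_k^{-1} \sum_{n=1}^{M_k} \ell(\theta_n,d)\, E\{I_{B_{jk}}(X_n)\}\Big\} \ge \sum_{j=1}^{\infty} E\big\{M_k^{-1}\, |A_{jk}|\, \varphi(\tau_{A_{jk}})\big\},$$

where

$$\varphi(\tau) = \min_{d \in D} \{\ell(\tau,d)\}$$

and $\tau_{A_{jk}}$ is the empirical distribution of the parameters $\{\theta_n : n \in A_{jk}\}$. Thus

$$R'_k \le \sum_{j=1}^{\infty} E\Big\{M_k^{-1}\, E\Big\{\sum_{n \in A_{jk}} \ell(\theta_n,S_n) - |A_{jk}|\, \varphi(\tau_{A_{jk}}) \,\Big|\, \mathfrak{M}_{jk}\Big\}\Big\}.$$

Applying now the Lemma of Section III to the terms under the first expectation sign, we obtain

$$R'_k \le \sum_{j=1}^{\infty} K\, p_k^{-1/2}\, E\big\{M_k^{-1} |A_{jk}|^{1/2}\big\} \le \sum_{j=1}^{\infty} K\, (M_k p_k)^{-1/2}\, [\nu_{\tau_k}(B_{jk})]^{1/2}, \tag{4.10}$$

where

$$\nu_\tau(\cdot) = \int_\Theta \nu_\theta(\cdot)\, d\tau(\theta).$$

The summands in (4.10), however, are also bounded by

$$2C\, E\{M_k^{-1} |A_{jk}|\} = 2C\, \nu_{\tau_k}(B_{jk}),$$

which sums up to the constant $2C$ since $\nu_{\tau_k}$ is a probability measure. Thus the dominated convergence theorem applies to (4.10) and together with the assumption $M_k p_k \to +\infty$ yields (4.9).

The theorem is proved.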
V. FINAL REMARKS

Let us make a few remarks concerning the problem. First notice that the decision rule suggested here does not make use of all the information available to the decision-maker, since the information obtained when active decisions are made is disregarded. This has been done mainly for the sake of the proof; it is, nevertheless, conceivable that if the disregarded information were used the convergence might be faster. The rate of convergence itself would be worth investigating. It was shown under a similar assumption in a game situation [9] that $R_n = o(n^{[\ldots]})$ (for a special choice of […]). Here, the rate would probably also depend on the partitions $\mathfrak{B}_k$ and possibly on the family $\mathcal{P}$. Reference [9] also indicates that the convergence of average losses instead of risks may also be proven. However, the question of modifying the decision rule so as to use the whole of the past information obtained, and not to begin each block from scratch, remains open for the time being.